{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/jermwatt/machine_learning_refined/blob/main/notes/16_Linear_algebra/16_3_Matrices.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Chapter 16: Elements of linear algebra"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook contains interactive content from an early draft of the university textbook <a href=\"https://github.com/neonwatty/machine-learning-refined/tree/main\">\n",
    "Machine Learning Refined (2nd edition) </a>.\n",
    "\n",
    "The final draft significantly expands on this content and is available for <a href=\"https://github.com/neonwatty/machine-learning-refined/tree/main/chapter_pdfs\"> download as a PDF here</a>."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# C.3 Matrices and Matrix Operations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this Section we introduce the concept of a matrix (also sometimes referred to as an array) as well as the basic operations one can perform on a single matrix or pairs of matrices.  These completely mirror those of the vector, including the transpose operation, addition/subtraction, and several multiplication operations including the inner, outer, and element-wise products.  Because of the close similarity to vectors this Section is much more terse than the previous. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# install github clone - allows for easy cloning of subdirectories\n",
    "!pip install github-clone\n",
    "from pathlib import Path \n",
    "\n",
    "# clone library subdirectory\n",
    "if not Path('chapter_16_library').is_dir():\n",
    "    !ghclone https://github.com/neonwatty/machine-learning-refined-notes-assets/tree/main/notes/16_Linear_algebra/chapter_16_library\n",
    "else:\n",
    "    print('chapter_16_library already cloned!')\n",
    "\n",
    "        \n",
    "# append path for local library, data, and image import\n",
    "import sys\n",
    "sys.path.append('./chapter_16_library') \n",
    "\n",
    "# backend file\n",
    "from linear_algebra_library import vector_plots, transform_animators\n",
    "\n",
    "# standard imports\n",
    "from matplotlib.axes._axes import _log as matplotlib_axes_logger\n",
    "matplotlib_axes_logger.setLevel('ERROR')\n",
    "import matplotlib.pyplot as plt\n",
    "from IPython.display import Image, HTML\n",
    "import autograd.numpy as np\n",
    "\n",
    "# this is needed to compensate for matplotlib notebook's tendancy to blow up images when plotted inline\n",
    "%matplotlib inline\n",
    "from matplotlib import rcParams\n",
    "rcParams['figure.autolayout'] = True\n",
    "\n",
    "%load_ext autoreload\n",
    "%autoreload 2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##  The Matrix"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we take a set of $P$ row vectors - each of dimension $1\\times N$ "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\\mathbf{x}_{1}=\\left[\\begin{array}{cccc}\n",
    "x_{11} & x_{12} & \\cdots & x_{1N}\\end{array}\\right]$$\n",
    "\n",
    "$$\\mathbf{x}_{2}=\\left[\\begin{array}{cccc}\n",
    "x_{21} & x_{22} & \\cdots & x_{2N}\\end{array}\\right]$$\n",
    "\n",
    "$$\\vdots$$\n",
    "\n",
    "$$\\mathbf{x}_{P}=\\left[\\begin{array}{cccc}\n",
    "x_{P1} & x_{P2} & \\cdots & x_{PN}\\end{array}\\right]$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "and stack them one-by-one on top of each other we form a *matrix* of dimension $P\\times N$\n",
    "\n",
    "\n",
    "$$\n",
    "\\mathbf{X}= \\begin{bmatrix}\n",
    "x_{11} & x_{12} & \\cdots & x_{1N}\\\\\n",
    "x_{21} & x_{22} & \\cdots & x_{2N}\\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots\\\\\n",
    "x_{P1} & x_{P2} & \\cdots & x_{PN}\n",
    "\\end{bmatrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In interpreting the dimension $P\\times N$ the first number $P$ is the number of rows in the matrix, with the second number $N$ denoting the number of columns.\n",
    "\n",
    "The notation we use to describe a matrix is a bold uppercase letter, as with $\\mathbf{X}$ above.  Like the vector notation nothing about the dimensions of the matrix is detailed by its notation - we need to explicitly state these."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The *transpose* operation we originally saw for vectors is defined by extension for matrices.  When performed on a matrix the transpose operation flips the entire array around - every column is turned into a row, and then these rows are stacked one on top of the other forming a $N\\times P$ matrix.  The same notation used previously for vectors - a superscript $T$ - is used to denote the transpose of a matrix\n",
    "\n",
    "$$\n",
    "\\mathbf{X} ^T= \\begin{bmatrix}\n",
    "x_{11} & x_{21} & \\cdots & x_{P1}\\\\\n",
    "x_{12} & x_{22} & \\cdots & x_{P2}\\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots\\\\\n",
    "x_{1N} & x_{2N} & \\cdots & x_{PN}\n",
    "\\end{bmatrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In numpy we define matrices just as we do with arrays, and the same notation is used to transpose the matrix.  We illustrate this with an example in the next  Python cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "----- the matrix X -----\n",
      "[[1 3 1]\n",
      " [2 5 1]]\n",
      "----- the transpose matrix X^T -----\n",
      "[[1 2]\n",
      " [3 5]\n",
      " [1 1]]\n"
     ]
    }
   ],
   "source": [
    "# create a 2x3 matrix\n",
    "X = np.array([[1,3,1],[2,5,1]])\n",
    "print ('----- the matrix X -----')\n",
    "print (X) \n",
    "\n",
    "# transpose the matrix\n",
    "print ('----- the transpose matrix X^T -----')\n",
    "print (X.T) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##  Addition and subtraction of matrices"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Addition and subtraction of matrices is performed element-wise, just as with vectors.  As with vectors two matrices must have the same dimensions in order to perform addition/subtraction on two matrices.  For example with two $P\\times N$ matrices\n",
    "\n",
    "$$\n",
    "\\mathbf{X}=\\begin{bmatrix}\n",
    "x_{11} & x_{12} & \\cdots & x_{1N}\\\\\n",
    "x_{21} & x_{22} & \\cdots & x_{2N}\\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots\\\\\n",
    "x_{P1} & x_{P2} & \\cdots & x_{PN}\n",
    "\\end{bmatrix} \\,\\,\\,\\,\\,\\,\\,\\,\\,  \\mathbf{Y}=\\begin{bmatrix}\n",
    "y_{11} & y_{12} & \\cdots & y_{1N}\\\\\n",
    "y_{21} & y_{22} & \\cdots & y_{2N}\\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots\\\\\n",
    "y_{P1} & y_{P2} & \\cdots & y_{PN}\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "their element-wise sum is\n",
    "\n",
    "$$\n",
    "\\mathbf{X}+\\mathbf{Y}=\\begin{bmatrix}\n",
    "x_{11}+y_{11} & x_{12}+y_{12} & \\cdots & x_{1N}+y_{1N}\\\\\n",
    "x_{21}+y_{21} & x_{22}+y_{22} & \\cdots & x_{2N}+y_{2N}\\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots\\\\\n",
    "x_{P1}+y_{P1} & x_{P2}+y_{P2} & \\cdots & x_{PN}+y_{PN}\n",
    "\\end{bmatrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Addition / subtraction of matrices using numpy is very done precisely as with vectors - with numpy both are referred to as *arrays*."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "----- the matrix X -----\n",
      "[[1 3 1]\n",
      " [2 5 1]]\n",
      "----- the matrix Y -----\n",
      "[[ 5  9 14]\n",
      " [ 1  2  1]]\n",
      "----- the matrix X + Y -----\n",
      "[[ 6 12 15]\n",
      " [ 3  7  2]]\n"
     ]
    }
   ],
   "source": [
    "# create two matrices\n",
    "X = np.array([[1,3,1],[2,5,1]])\n",
    "Y = np.array([[5,9,14],[1,2,1]])\n",
    "print ('----- the matrix X -----')\n",
    "print (X) \n",
    "print ('----- the matrix Y -----')\n",
    "print (Y)\n",
    "\n",
    "# add  matrices\n",
    "print ('----- the matrix X + Y -----')\n",
    "print (X + Y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##  Multiplication"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Multiplication by a scalar"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As with vectors we can multiply a matrix by a scalar - and this operation is performed element-by-element.  For any scalar value $c$ we write scalar multiplication as \n",
    "\n",
    "$$\n",
    "c\\times\\mathbf{X}=\\begin{bmatrix}\n",
    "c\\times x_{11} & c\\times x_{12} & \\cdots & c\\times x_{1N}\\\\\n",
    "c\\times x_{21} & c\\times x_{22} & \\cdots & c\\times x_{2N}\\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots\\\\\n",
    "c\\times x_{P1} & c\\times x_{P2} & \\cdots & c\\times x_{PN}\n",
    "\\end{bmatrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In numpy scalar multiplication can be written very naturally using the '*' symbol, as illustrated in the next Python cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 2  6  2]\n",
      " [ 4 10  2]]\n"
     ]
    }
   ],
   "source": [
    "# define a matrix\n",
    "X = np.array([[1,3,1],[2,5,1]])\n",
    "c = 2\n",
    "print (c*X)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Multiplication of a matrix by a vector"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Generally speaking there are two ways to multiply an $P\\times N$ matrix $\\mathbf{X}$ by a vector $\\mathbf{a}$.  The first - referred to as *left multiplication* - involves multiplication by $1\\times P$ row vector $\\mathbf{a}$.  This operation is written $\\mathbf{a}\\mathbf{X} = \\mathbf{b}$, with $\\mathbf{b}$ being a $1\\times N$ dimensional vector.  It is defined by taking the inner product of $\\mathbf{a}$ with each column of $\\mathbf{X}$.\n",
    "\n",
    "$$\n",
    "\\mathbf{a}\\mathbf{X} = \\mathbf{b} = \n",
    "\\begin{bmatrix}\n",
    "\\sum_{p=1}^P a_px_{p1} \\,\\,\\,\\,\\,\n",
    "\\sum_{p=1}^P a_px_{p2} \\,\\,\\,\\,\\, \n",
    "\\cdots \\,\\,\\,\\,\\,\n",
    "\\sum_{p=1}^P a_px_{pN} \n",
    "\\end{bmatrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since this multiplication consists of a sequence of inner products, we can use the inner or dot product notation in numpy to compute a left multiplication as illustrated in the next cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[3 8 2]]\n"
     ]
    }
   ],
   "source": [
    "# define a matrix\n",
    "X = np.array([[1,3,1],[2,5,1]])\n",
    "a = np.array([1,1])\n",
    "a.shape = (1,2)\n",
    "\n",
    "# compute a left multiplication\n",
    "print (np.dot(a,X))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Right multiplication is defined by multiplying $\\mathbf{X}$ on the right by a $N\\times 1$ vector $\\mathbf{a}$.  Right multiplication is written as $\\mathbf{X}\\mathbf{a} = \\mathbf{b}$ and $\\mathbf{b}$ will be a $P\\times 1$ vector.  The right product is defined as \n",
    "\n",
    "$$\n",
    "\\mathbf{X}\\mathbf{a} = \\mathbf{b} = \n",
    "\\begin{bmatrix}\n",
    "\\sum_{n=1}^N a_nx_{1n} \\\\\n",
    "\\sum_{n=1}^N a_nx_{2n} \\\\ \n",
    "\\vdots \\\\\n",
    "\\sum_{n=1}^N a_nx_{Pn} \n",
    "\\end{bmatrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since the right multiplication also consists of a sequence of inner products, we can use the inner or dot product notation in numpy to compute a right multiplication as illustrated in the next cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[5]\n",
      " [8]]\n"
     ]
    }
   ],
   "source": [
    "# define a matrix\n",
    "X = np.array([[1,3,1],[2,5,1]])\n",
    "a = np.array([1,1,1])\n",
    "a.shape = (3,1)\n",
    "\n",
    "# compute a right multiplication\n",
    "print (np.dot(X,a))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "### Element-wise multiplication of two matrices"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As with vectors, we can define element-wise multiplication on two matrices of the same size.  Multiplying two $P\\times N$ matrices $\\mathbf{x}$ and $\\mathbf{y}$ together gives"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\mathbf{X}\\times \\mathbf{Y}=\\begin{bmatrix}\n",
    "x_{11}\\times y_{11} & x_{12}\\times y_{12} & \\cdots & x_{1N}+y_{1N}\\\\\n",
    "x_{21}\\times y_{21} & x_{22}\\times y_{22} & \\cdots & x_{2N}+y_{2N}\\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots\\\\\n",
    "x_{P1}\\times y_{P1} & x_{P2}\\times y_{P2} & \\cdots & x_{PN}+y_{PN}\n",
    "\\end{bmatrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This can be easily computed in numpy, as illustrated in the next Python cell for two small example matrices."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "----- the matrix X -----\n",
      "[[1 3 1]\n",
      " [2 5 1]]\n",
      "----- the matrix Y -----\n",
      "[[ 5  9 14]\n",
      " [ 1  2  1]]\n",
      "----- the matrix X * Y -----\n",
      "[[ 5 27 14]\n",
      " [ 2 10  1]]\n"
     ]
    }
   ],
   "source": [
    "# create two matrices\n",
    "X = np.array([[1,3,1],[2,5,1]])\n",
    "Y = np.array([[5,9,14],[1,2,1]])\n",
    "print ('----- the matrix X -----')\n",
    "print (X) \n",
    "print ('----- the matrix Y -----')\n",
    "print (Y)\n",
    "\n",
    "# add  matrices\n",
    "print ('----- the matrix X * Y -----')\n",
    "print (X*Y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### General multiplication of two matrices"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The regular product (or simply product) of two matrices $\\mathbf{X}$ and $\\mathbf{Y}$ can be defined based on the vector outer product operation, provided that the number of columns in $\\mathbf{X}$ matches the number of rows in $\\mathbf{Y}$. That is, we must have $\\mathbf{X}$ and $\\mathbf{Y}$ of sizes $P\\times N$ and $N \\times Q$ respectively, for the matrix product to be defined as \n",
    "\n",
    "$$\\mathbf{XY}= \\sum_{n=1}^N \\mathbf{x}_{n}\\mathbf{y}_{n}^{T}$$\n",
    "\n",
    "where $\\mathbf{x}_{n}$ is the $n^{th}$ column of $\\mathbf{X}$, and $\\mathbf{y}_{n}^{T}$ is the transpose of the $n^{th}$ column of $\\mathbf{Y}^{T}$ (or equivalently, the $n^{th}$ row of $\\mathbf{Y}$). Note that each summand above is an outer-product matrix of size $P \\times Q$, and so too is the final matrix $\\mathbf{XY}$.\n",
    "\n",
    "Matrix multplication can also be defined entry-wise, using vector inner-products, where the entry in the $p^{th}$ row and $q^{th}$ column of $\\mathbf{XY}$ can be found as the inner-product of (transpose of) the $p^{th}$ row in $\\mathbf{X}$ and the $q^{th}$ column in $\\mathbf{Y}$.\n",
    "\n",
    "$$\\left(\\mathbf{XY}\\right)_{p,q}= \\mathbf{x}_{p}^{T}\\mathbf{y}_{q}$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "----- the matrix X -----\n",
      "[[1 3 1]\n",
      " [2 5 1]]\n",
      "----- the matrix Y -----\n",
      "[[ 5  1]\n",
      " [ 9  2]\n",
      " [14  1]]\n",
      "----- the matrix XY -----\n",
      "[[46  8]\n",
      " [69 13]]\n"
     ]
    }
   ],
   "source": [
    "# create two matrices\n",
    "X = np.array([[1,3,1],[2,5,1]])\n",
    "Y = np.array([[5,9,14],[1,2,1]])\n",
    "Y = np.array([[5,1],[9,2],[14,1]])\n",
    "print ('----- the matrix X -----')\n",
    "print (X) \n",
    "print ('----- the matrix Y -----')\n",
    "print (Y)\n",
    "\n",
    "# add  matrices\n",
    "print ('----- the matrix XY -----')\n",
    "print (np.dot(X,Y))"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  },
  "toc": {
   "colors": {
    "hover_highlight": "#DAA520",
    "navigate_num": "#000000",
    "navigate_text": "#333333",
    "running_highlight": "#FF0000",
    "selected_highlight": "#FFD700",
    "sidebar_border": "#EEEEEE",
    "wrapper_background": "#FFFFFF"
   },
   "moveMenuLeft": true,
   "nav_menu": {
    "height": "509px",
    "width": "252px"
   },
   "navigate_menu": true,
   "number_sections": false,
   "sideBar": true,
   "threshold": 4,
   "toc_cell": false,
   "toc_section_display": "block",
   "toc_window_display": false,
   "widenNotebook": false
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "state": {},
    "version_major": 1,
    "version_minor": 0
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
